A-Learning for Approximate Planning

Authors

  • D. Blatt
  • S. A. Murphy
Abstract

We consider a new algorithm for reinforcement learning called A-learning. A-learning learns the advantages from a single training set. We compare A-learning with function approximation to Q-learning with function approximation and find that, because A-learning approximates only the advantages, it is less likely than Q-learning to exhibit bias due to the function approximation.
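As a rough illustration of the distinction the abstract draws between Q-values and advantages, the sketch below uses a made-up tabular example. It assumes the common definition of an advantage as an action's Q-value minus the best Q-value in the same state; the table `Q` and the helpers `advantages` and `greedy_action` are hypothetical names introduced here. This is not the A-learning algorithm of the paper, only a picture of the quantity it targets.

```python
# Toy illustration only: hypothetical Q-values for two states and three actions.
Q = {
    "s0": {"a0": 10.0, "a1": 10.5, "a2": 9.0},
    "s1": {"a0": -3.0, "a1": -2.0, "a2": -2.5},
}


def advantages(q_row):
    """Return each action's Q-value minus the best Q-value in that state.

    Assumed definition: the best action has advantage 0, all others are <= 0.
    """
    best = max(q_row.values())
    return {action: q - best for action, q in q_row.items()}


def greedy_action(values):
    """Return the action with the largest value."""
    return max(values, key=values.get)


# Advantage table derived from the hypothetical Q-table.
A = {state: advantages(row) for state, row in Q.items()}

# The greedy policy is the same whether it is read off Q or off the advantages,
# even though the advantages discard the overall level of value in each state.
for state in Q:
    assert greedy_action(Q[state]) == greedy_action(A[state])
```

In this toy table the two states have very different value levels (around 10 versus around -3), yet their advantages live on the same small scale. That is one informal way to read the abstract's point that approximating only the advantages asks less of the function approximator.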


Similar resources

A FUZZY MINIMUM RISK MODEL FOR THE RAILWAY TRANSPORTATION PLANNING PROBLEM

The railway transportation planning under the fuzzy environment is investigated in this paper. As a main result, a new modeling method, called minimum risk chance-constrained model, is presented based on the credibility measure. For the convenience of solving the mathematical model, the crisp equivalents of chance functions are analyzed under the condition that the involved fuzzy parameter...

Model-based Reinforcement Learning in Modified Lévy Jump-Diffusion Markov Decision Model and Its Financial Applications

This thesis intends to address an important cause of the 2007-2008 financial crisis, the non-normality of asset returns, by incorporating the prediction of asset pricing jumps into asset pricing models. Several different machine learning techniques, including the Unscented Kalman Filter and Approximate Planning are used, and an improvement in Approximate Planning is developed to improve algorithm time...

A model-based approximate λ-policy iteration approach to online evasive path planning and the video game Ms. Pac-Man

This paper presents a model-based approximate λ-policy iteration approach using temporal differences for optimizing paths online for a pursuit-evasion problem, where an agent must visit several target positions within a region of interest while simultaneously avoiding one or more actively pursuing adversaries. This method is relevant to applications, such as robotic path planning, mobile-sensor...

Learning Heuristic Functions through Approximate Linear Programming

Planning problems are often formulated as heuristic search. The choice of the heuristic function plays a significant role in the performance of planning systems, but a good heuristic is not always available. We propose a new approach to learning heuristic functions from previously solved problem instances in a given domain. Our approach is based on approximate linear programming, commonly used ...

The Effects of Educational Planning on Learning of Occupational Health Students

Background: Presentation of course plans is advised, but presentation of lesson plans is useful too. These items can introduce the course and the lessons of each session to students. The objective of this study was to determine the effects of educational planning on the learning of occupational diseases. Methods: This was a semi-experimental study which was conducted by using the cur...

Learning to Rank for Synthesizing Planning Heuristics

We investigate learning heuristics for domain-specific planning. Prior work framed learning a heuristic as an ordinary regression problem. However, in a greedy best-first search, the ordering of states induced by a heuristic is more indicative of the resulting planner’s performance than mean squared error. Thus, we instead frame learning a heuristic as a learning to rank problem which we solve u...


Publication date: 2004